Discovering missed synonymy in a large concept-oriented Metathesaurus

Authors

  • William T. Hole
  • Suresh Srinivasan
Abstract

The Unified Medical Language System (UMLS) [1, 2] Metathesaurus is concept-oriented; its goal is to unite all names with identical meaning in a single Concept. The names come from its constituent vocabularies or "sources": a wide variety of biomedical terminologies, including many controlled vocabularies and classifications used in patient records, administrative health data, bibliographic, research, full-text, and expert systems. Many offer little definitional information, and many are not themselves concept-oriented, so identifying synonymy is a challenging semantic task [3]. The rapidly increasing size of the Metathesaurus makes the task daunting and demands effective computational support; the January 2000 release contains more than 1.5 million names for 730,000 concepts. Vocabularies are added and updated using sophisticated lexical matching, selective algorithms, and expert review [4, 5, 6]. Yet the result is imperfect; we have discovered and corrected missed synonymy in approximately 1% of previously released concepts each year. This paper reviews general methods for finding missed synonymy and describes several specific novel approaches that we have found effective.
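
As a rough illustration of the general idea of lexical matching mentioned above (not the authors' actual algorithms, which the abstract does not specify), the following minimal sketch groups names by a crude normalized form and flags any normalized form shared by more than one concept as a candidate for expert review. The function names, the normalization rule, and the toy concept identifiers are all hypothetical.

```python
import re
from collections import defaultdict


def normalize(name: str) -> str:
    """Crude lexical normalization (an assumption for illustration):
    lowercase, drop punctuation, and sort the remaining words."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(sorted(words))


def candidate_missed_synonymy(names):
    """Given (concept_id, name) pairs, return a dict mapping each
    normalized name to the sorted concept ids that share it, keeping
    only normalized names that appear in more than one concept."""
    by_key = defaultdict(set)
    for concept_id, name in names:
        by_key[normalize(name)].add(concept_id)
    return {key: sorted(ids) for key, ids in by_key.items() if len(ids) > 1}


# Hypothetical toy data; the identifiers are made up, not real CUIs.
toy_names = [
    ("C001", "Heart Attack"),
    ("C002", "Attack, heart"),
    ("C003", "Myocardial infarction"),
    ("C003", "Infarction, Myocardial"),
]

candidates = candidate_missed_synonymy(toy_names)
print(candidates)
```

On this toy input, only the two "heart attack" variants are flagged, because they sit in different concepts; the two "myocardial infarction" variants already share one concept. A purely lexical check of this kind cannot tell that "heart attack" and "myocardial infarction" mean the same thing, which is why candidates would still need semantic methods and expert review.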

Similar resources

Tracking meaning over time in the UMLS Metathesaurus

The Unified Medical Language System® (UMLS) Metathesaurus contains records arranged by concept or meaning. Each concept contains a unique identifier (CUI) that can be used to track the concept over time. Since the January 2001 release, the Metathesaurus has included the file MRCUI that contains mappings for CUIs that disappear. This paper describes the processes that facilitated this effort a...

Concepts and Synonymy in the UMLS Metathesaurus

This paper advances a detailed exploration of the complex relationships among terms, concepts, and synonymy in the UMLS Metathesaurus, and proposes the study and understanding of the Metathesaurus from a model-theoretic perspective. Initial sections provide the background and motivation for such an approach, and a careful informal treatment of these notions is offered as a context and basis for...

Quantifying the Impact and Extent of Undocumented Biomedical Synonymy Supporting Information

Consistent with previous observations [1, 2, 3], we noticed that many of the terms contained within the UMLS Metathesaurus were inappropriate for natural language-oriented analyses (ex: database-specific encodings, machine permutations, non-English language entries, etc.). Therefore, prior to generating the terminologies utilized in this study, we subjected the Metathesaurus to a thorough, rule...

Increasing UMLS Coverage and Reducing Ambiguity via Automated Creation of Synonymous Terms: First Steps toward Filling UMLS Synonymy Gaps

Background: Although extensive synonymy is one of the greatest strengths of the UMLS Metathesaurus, much research has nonetheless focused on identifying and measuring gaps in UMLS synonymy. This paper proposes a methodology for further extending the UMLS’ already rich synonymy by semi-automatically creating new strings not in the UMLS, and including them as additional synonymous strings within ...

Battling Scylla and Charybdis: the search for redundancy and ambiguity in the 2001 UMLS metathesaurus

I previously developed methods for identifying cases of multiple synonymous concepts (redundancy) and concepts with multiple meanings (ambiguity) and applied them to the 1995 UMLS Metathesaurus. These methods use semantic approaches (including knowledge about word synonymy and the semantic types assigned to concepts) to complement the standard lexical approaches. In this paper, I describe the r...

Journal:
  • Proceedings. AMIA Symposium

Publication date: 2000